Correlations in music that exist within its waveform are studied. Monophonic
wave files of random music are generated and the probability distribution
function of the time interval between large signal values is analyzed. A power
law behavior for the distribution function in the range from 0.1 millisecond to
20 milliseconds is observed. An attempt is made to investigate the origin of
these correlations by randomizing each of the factors (frequencies, intensities
and durations of notes of the random music files) separately.

PACS numbers: 43.75.Zz, 43.60.Ce, 43.75.Wx
I. INTRODUCTION



Trying to quantify any form of art is a challenging task. The attempt here is
to look into the waveform of music using statistical and computational methods
and to investigate the correlations that exist in music. Several earlier
studies have analyzed correlations in music. Voss and Clarke [1] did pioneering
work in this field. They analyzed the spectrum of time series of various genres
of music, including Classical, Jazz, and Rock, taken from radio broadcasts. The
durations of the samples extended to about 12 hours each. They observed
prominent $1/f$ spectral behavior in all the time series that they analyzed and
concluded that all "intelligent" behavior should show a $1/f$-like spectral
density.

Beran [?] argues that $1/f$ behavior could be due to the instruments alone and
not depend on the composers or the genre. He demonstrates that even one single
note shows a $1/f$ noise-like behavior. A whole musical piece is considered by
most musicians to be the largest unit of artistic significance [?, ?]. Since
Voss and Clarke had included hours of recording from radio stations that
included announcements and played different pieces of music, their analysis may
not be conclusive enough to make deductions about music [?] and how to quantify
its aesthetic attributes. Boon and Decroly [3] analyzed 23 pieces and found the
slope of the bi-logarithmic plot of the power spectrum to lie consistently
between 1.79 and 1.97, which is close to red-noise-like behavior. Their results
also matched Nettheim's results [?]. Both groups analyzed whole pieces of music,
which are considered to be the largest units of significance of musical
expression. Other attempts to computationally identify aspects of musicality
have been made in the past. Manaris et al. [?] found that certain musical
attributes exhibit the Zipf-Mandelbrot distribution (a power law distribution
on ranked data) and that statistical analysis of certain metrics defined on
music can potentially be reliable in computationally identifying aesthetic
attributes of music.

Voss and Clarke also applied their results to stochastic composition. This was,
however, done using two separate melody and duration sequences, which were both
assumed to be separate $1/f$ processes. The meaning of a spectrum of durations
is unclear, for a duration itself moves along the time axis [?]. Rhythm has
been converted into more meaningful data sets by Su and Wu [?], and studies on
clustering patterns of these data points have been done to understand the
fractal nature of music [1, ?]. Fractal dimension plots have also shown
agreement with perception density of events during a piece of music [?]. The
multifractal spectrum plot also seems to be a potential tool for
computationally distinguishing various styles of music [?]. The objective in
this paper is to look for structure in terms of correlations in peaks within
the waveform, and to look for statistical features other than the ones
described above that may be intrinsic to music.

Electronic mail: Vishnu.Sreekumar85@gmail.com

It has been found that in many natural non-equilibrium systems, the probability
distribution of the recurrence time of the peaks (or large events) follows a
power law, hence showing self-similarity:

$$P(\tau) = A \tau^{-\beta} \qquad (1)$$

where $\tau$ is the interval between large events, $P(\tau)$ is the interval
distribution, $A$ is a constant and $\beta$ is the power law index. The above
law is due to the temporal correlations in the system. These features have been
observed for earthquakes [13], solar flares [3], and many other phenomena.
Verma et al. [?] derived a universal scaling law connecting the sizes,
recurrence time, and spatial interval of large events and applied it to
voltage-dependent anion channel (VDAC) current time series. Inspired by such
studies that analyze correlations in the recurrence time of large events, the
present manuscript applies the same idea to music. P. Diodati and S. Piazza
[16] had done a similar study with a focus on Gioacchino Rossini's "La calunnia
è un venticello" from Il barbiere di Siviglia. We probe the origin of the
correlations by systematically varying the musical parameters.
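As a brief worked step, writing Eq. (1) as $P(\tau) = A\tau^{-\beta}$ and taking logarithms shows why a power law appears as a straight line of slope $-\beta$ on a bi-logarithmic plot. The last line spells out the self-similarity; the rescaling factor $\lambda > 0$ is introduced here only for illustration and is not part of the original text.

```latex
\begin{align}
P(\tau) &= A\,\tau^{-\beta}, \\
\log P(\tau) &= \log A - \beta \log \tau, \\
P(\lambda\tau) &= A\,(\lambda\tau)^{-\beta} = \lambda^{-\beta}\,P(\tau).
\end{align}
```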

The organization of the paper is as follows: In Section 
II several time series of computer-generated music are 
analyzed. The results are described for the statistics of 
the duration between large peaks. Section III contains 
conclusions. 



II. ANALYSIS OF CONTROLLED RANDOM MUSIC

Music time series are generated using a computer program and analyzed. The
output of the program is in MIDI format, which in turn is converted into WAVE
format for analysis. Short music files of 50 notes each are first generated and
given random characteristics of rhythm (durations of notes), melody
(frequencies of the notes) and loudness (intensities with which the notes are
played). A sequence of 50 notes is probably too short to be considered a
significant piece of musical expression (though monophonic mobile phone
ringtones can be about as long). Music files are then generated with the same
combinations but now with 300 notes and with a different (panflute) tone. This
is closer to a unit of meaningful musical expression. Only monophonic music is
studied here.

These time series are created with varying degrees of randomness of
intensities, frequencies and durations of notes. For example, one file has
notes played with the same intensities and the same durations, but different
frequencies. All other such combinations are studied. Probability distributions
are assigned from which the intensities and frequencies are selected. Durations
are randomly assigned but have to be chosen from a restricted set: a 32nd note
(demisemiquaver) is defined as $2^0$ times 100 ms, a 16th note (semiquaver) as
$2^1$ times 100 ms, an 8th note (quaver) as $2^2$ times 100 ms, a quarter note
(crotchet) as $2^3$ times 100 ms, a half note (minim) as $2^4$ times 100 ms and
a whole note (semibreve) as $2^5$ times 100 ms.
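The restricted duration set can be sketched as follows. This is a minimal illustration rather than the program used for the paper: the names `DURATIONS_MS`, `random_durations` and `constant_durations` are hypothetical, and the text does not say how a duration is drawn from the set, so equal probability for each of the six values is assumed here.

```python
import random

BASE_MS = 100  # duration of a 32nd note in milliseconds, as stated in the text

# Allowed note durations: 2**k * 100 ms for k = 0..5.
DURATIONS_MS = {
    "32nd": 2**0 * BASE_MS,
    "16th": 2**1 * BASE_MS,
    "8th": 2**2 * BASE_MS,
    "quarter": 2**3 * BASE_MS,
    "half": 2**4 * BASE_MS,
    "whole": 2**5 * BASE_MS,
}


def random_durations(n_notes, rng):
    """Pick n_notes durations (in ms) from the restricted set.

    Assumption: each allowed duration is equally likely.
    """
    allowed = list(DURATIONS_MS.values())
    return [rng.choice(allowed) for _ in range(n_notes)]


def constant_durations(n_notes, duration_ms):
    """The 'same' (S) case: every note gets the same duration."""
    return [duration_ms] * n_notes


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    print(random_durations(10, rng))
```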

We analyze computer-generated violin and panflute music time series by varying
three parameters: frequencies, intensities and durations. Each of these three
parameters can either be random or held constant. The random frequencies are
chosen from one of two probability distributions, quadratic random or uniform
random. The quadratic distribution is given by

$$P(f) = B (f - f_{\min})(f_{\max} - f) \qquad (2)$$

where $B = 10^{-4}$, $f$ is the frequency, and $f_{\min}$ and $f_{\max}$ are
the minimum and maximum values of $f$. They are denoted by the numbers 36 and
96 respectively in the computer program. The number 60 denotes "middle C",
which is about 261 Hz. This quadratic probability distribution peaks at the
average value of $f_{\min}$ and $f_{\max}$, so frequencies from the middle of
the defined range are selected more often when the quadratic distribution is
used. The uniform random distribution is defined for the same range of
frequencies; all frequencies in the range are equally probable when it is used.
In this paper, these two distributions are denoted by Qr and Ur respectively.
The intensities are taken to be either constant or random with a quadratic
distribution. Similarly, durations are either constant or randomly chosen from
the set defined in the previous paragraph. For brevity we denote each time
series by a short label whose first two letters denote the nature of randomness
of the frequencies, and whose third and fourth letters denote the same for the
intensities and durations respectively. Here S denotes "same" and R denotes
"random". See Table I and Table II for the labels of the violin and panflute
music time series. For example, a time series named QrRS contains notes of
random frequencies (chosen from a quadratic distribution), random intensities
and the same durations.
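A minimal sketch of the two frequency distributions, under stated assumptions: the numbers 36 to 96 are treated as integer MIDI note numbers (the text only says the program output is MIDI and that 60 is middle C), and the weights of Eq. (2) are normalized numerically, so the value of B does not affect the draw. The function names are hypothetical.

```python
import random

F_MIN, F_MAX = 36, 96  # note numbers quoted in the text
B = 1e-4               # prefactor of Eq. (2)


def quadratic_weight(f):
    """Unnormalized weight B (f - f_min)(f_max - f) from Eq. (2)."""
    return B * (f - F_MIN) * (F_MAX - f)


def sample_notes(n_notes, kind, rng):
    """Draw note numbers with the 'Qr' (quadratic) or 'Ur' (uniform) rule."""
    notes = list(range(F_MIN, F_MAX + 1))
    if kind == "Qr":
        weights = [quadratic_weight(f) for f in notes]
    elif kind == "Ur":
        weights = [1.0] * len(notes)
    else:
        raise ValueError(f"unknown distribution: {kind!r}")
    # random.choices normalizes the weights internally.
    return rng.choices(notes, weights=weights, k=n_notes)


def note_to_hz(note):
    """Equal-tempered pitch, assuming MIDI numbering (note 69 = A4 = 440 Hz)."""
    return 440.0 * 2 ** ((note - 69) / 12)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    print(sample_notes(10, "Qr", rng))
    print(round(note_to_hz(60), 2))  # middle C in Hz
```

With these assumptions the quadratic weight is largest at note 66, the midpoint of the range, which matches the statement that mid-range frequencies are selected more often.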



A. Violin plots 

Here we analyze the time series of the computer-generated violin music. The
wave file is read and 1006970 data points are used for this analysis. The music
files (violin) are 22 seconds long and contain 50 notes each. Hence the average
duration of one note is about 440 ms. For the time series, a threshold of
$1.5\sigma$ is chosen as a cutoff for identifying the peaks, or large events,
where $\sigma$ is the standard deviation. The interval distribution is then
computed. Figs. 1-4 illustrate the distributions for the four cases of Table I.
In Figs. 1-4, the unit time interval is 22/1006970 s = 21.8 microseconds.
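The thresholding step can be sketched as below. The text does not specify whether a "peak" is any sample beyond the threshold or only a local maximum, nor whether the sign of the signal matters; this sketch assumes that every sample with absolute value above $1.5\sigma$ counts as a large event, and measures intervals in units of one sample. All function names are hypothetical.

```python
import statistics
from collections import Counter


def recurrence_intervals(signal, k=1.5):
    """Intervals (in samples) between successive large events.

    Assumption: a large event is any sample with |x| > k * sigma.
    """
    sigma = statistics.pstdev(signal)
    threshold = k * sigma
    peaks = [i for i, x in enumerate(signal) if abs(x) > threshold]
    return [later - earlier for earlier, later in zip(peaks, peaks[1:])]


def interval_distribution(intervals):
    """Empirical probability of each interval length."""
    counts = Counter(intervals)
    total = len(intervals)
    return {tau: counts[tau] / total for tau in sorted(counts)}


def unit_interval_us(duration_s, n_samples):
    """Length of one sample step in microseconds."""
    return duration_s / n_samples * 1e6


if __name__ == "__main__":
    toy = [0, 0, 5, 0, 0, 0, 5, 0, 5, 0]  # toy signal, not real audio
    gaps = recurrence_intervals(toy)
    print(gaps, interval_distribution(gaps))
    print(unit_interval_us(22, 1006970))  # violin files in the text
```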

The plots exhibit power law behavior for the recurrence interval between the
peaks. The slopes range from -1.6 to -2.7. Note that $P(\tau) \sim \tau^{-2}$
implies a $1/f$ spectral density when the amplitudes of the peaks are uniform
[?]. Since the slopes described here are close to -2, these results are in
general agreement with the $1/f$ spectral densities reported by Voss and
Clarke. For completeness, in Tables I and II we also report $R^2$, the
proportion of the variation that is explained by the model, which is the power
law model in this case.
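One way to obtain a slope and $R^2$ from a bi-logarithmic plot is an ordinary least-squares line through the points $(\log\tau, \log P)$. The text does not state which fitting procedure was used, so the sketch below is an assumption, and `fit_power_law` is a hypothetical name.

```python
import math


def fit_power_law(taus, probs):
    """Least-squares line through (log10 tau, log10 P).

    Returns (slope, intercept, r_squared). All inputs must be positive.
    """
    xs = [math.log10(t) for t in taus]
    ys = [math.log10(p) for p in probs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - (residual sum of squares) / (total sum of squares)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    taus = [1, 2, 4, 8, 16]
    probs = [3.0 * t ** -2 for t in taus]  # exact power law for checking
    print(fit_power_law(taus, probs))
```

For an exact power law the fitted slope equals the exponent and $R^2$ is 1 up to rounding; scatter in measured distributions lowers $R^2$.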
TABLE I: Slopes and coefficients of determination ($R^2$) for the
bi-logarithmic plots of the probability distribution function of the time
interval between large signal values for the violin wave files.

Violin Time Series    Slope    $R^2$
Fig. 1  QrSS          -2.0     0.7362
Fig. 2  UrSS          -1.6     0.6905
Fig. 3  UrRS          -2.5     0.9605
Fig. 4  UrSR          -2.7     0.9605
FIG. 1: QrSS

FIG. 2: UrSS

FIG. 3: UrRS

FIG. 4: UrSR



B. Panflute plots

Panflute music time series are studied here. 6978845 data points are used for
the analysis. The computer-generated panflute music files are 156 seconds long
and contain 300 notes each. Hence the average duration of one note is about
half a second. As in the case of the violin time series, a threshold of
$1.5\sigma$ is chosen as the cutoff and the peaks are identified. Figs. 5-7
illustrate the interval distributions for the three cases of Table II. The
value corresponding to the unit time interval in Figs. 5-7 is 156/6978845 s =
22.35 microseconds. The interval distributions exhibit power law behavior and
the slopes range from -1.5 to -2.2.

TABLE II: Slopes and coefficients of determination ($R^2$) for the
bi-logarithmic plots of the probability distribution function of the time
interval between large signal values for the panflute wave files.

Panflute Time Series  Slope    $R^2$
Fig. 5  QrSS          -1.5     0.6501
Fig. 6  UrSS          -1.9     0.7642
Fig. 7  UrRS          -2.2     0.9970

In the next subsection, we probe the role of the various musical attributes in
giving rise to the power law correlations.



C. Repeated notes 

To understand the origin of the power law correlations, the three factors,
melody, rhythm and loudness, are randomized separately. Frequencies of the
notes are held constant (261 Hz) throughout this study while intensities and
durations are randomized systematically. The power law disappears completely
when all three parameters are held constant, i.e., when the notes are repeated
50 times with the same intensities and durations (Fig. 8). Randomizing
durations alone (Fig. 9) does not restore the power law behavior, but the power
law reappears when randomization of durations is coupled with randomization of
intensities (Fig. 11). Randomizing intensities alone, while holding both
frequencies and durations constant, also restores the power law behavior (Fig.
10).

In other words, music time series of notes played with constant pitch (no
"melody"), constant loudness (same intensities) but random rhythm (durations)
do not exhibit the power law behavior. Music time series of notes played with
constant pitch (no "melody"), constant loudness (same intensities) and constant
durations also do not exhibit this power law behavior. This study points
towards the possibility that random characteristics of melody and loudness play
an important role in the origin of this power law behavior of interval
distributions in music while randomness of durations does not.

FIG. 5: QrSS

FIG. 6: UrSS

FIG. 7: UrRS

FIG. 8: SSS

FIG. 9: SSR

FIG. 10: SRS

FIG. 11: SRR
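The labels used in the tables and figure captions can be decoded mechanically. The sketch below is illustrative only: `decode_label` is a hypothetical helper, and reading the three-letter labels of this subsection (SSS, SSR, SRS, SRR) as "same frequency" followed by the intensity and duration letters is an assumption made by analogy with the four-letter scheme, consistent with how Figs. 8-11 are described.

```python
def decode_label(label):
    """Split a time-series label into its three attributes.

    Four-letter labels start with 'Qr' or 'Ur' (random frequencies);
    three-letter labels are assumed to start with 'S' (same frequency).
    """
    frequency_names = {"Qr": "quadratic random", "Ur": "uniform random"}
    if label[:2] in frequency_names:
        frequencies, rest = frequency_names[label[:2]], label[2:]
    elif label[:1] == "S":
        frequencies, rest = "same", label[1:]
    else:
        raise ValueError(f"unrecognized label: {label!r}")
    if len(rest) != 2 or any(c not in "SR" for c in rest):
        raise ValueError(f"unrecognized label: {label!r}")
    names = {"S": "same", "R": "random"}
    return {
        "frequencies": frequencies,
        "intensities": names[rest[0]],
        "durations": names[rest[1]],
    }


if __name__ == "__main__":
    for label in ("QrRS", "UrSR", "SSS", "SRS"):
        print(label, decode_label(label))
```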



III. CONCLUSIONS

In this paper, the recurrence interval between peaks in music time series is
analyzed. Power law behavior $P(\tau) \propto \tau^{-\beta}$ is observed, with
the index $\beta$ varying from 1.5 to 2.7. The result is interesting in the
light of the $1/f$ spectral density reported earlier by Voss and Clarke. It has
been shown earlier that a time series with uniform amplitudes and a $\tau^{-2}$
recurrence time distribution shows a $1/f$ spectral density. Hence our results
on the interval distribution are in general agreement with the $1/f$ spectral
density for music.

An attempt is made to study the origin of the above 
correlations in music. The role of melody, loudness vari- 
ations, and rhythm in giving rise to these correlations 
is studied. The results seem to indicate that random- 
ness characteristics of melody and loudness variations 
contribute more to the origin of the power law behav- 
ior than randomness of rhythm (or durations). 

It is interesting to observe that the power law regime 
ends at about 20 milliseconds in all the cases studied. 
The origin of this cutoff may shed interesting light on the 
temporal correlations in music. The correlations studied 
in this paper are only between nearest peaks. One could 
study higher order correlations between more than two 
peaks. 